702 research outputs found

    Ratings are overrated!

    Are ratings of any use in human–computer interaction and user studies at large? If ratings are of limited use, is there a better alternative for quantitative subjective assessment? Beyond the intrinsic shortcomings of human reporting, there are a number of supplementary limitations and fundamental methodological flaws associated with rating-based questionnaires – i.e., questionnaires that ask participants to rate their level of agreement with a given statement, such as a Likert item. While the effect of these pitfalls has been largely downplayed, recent findings from diverse areas of study question the reliability of using ratings. Rank-based questionnaires – i.e., questionnaires that ask participants to rank two or more options – appear as the evident alternative that not only eliminates the core limitations of ratings but also simplifies the use of sound methodologies that yield more reliable models of the underlying reported construct: user emotion, preference, or opinion. This paper solicits recent findings from various disciplines interlinked with psychometrics and offers a quick guide for the use, processing, and analysis of rank-based questionnaires for the unique advantages they offer. The paper challenges the traditional state-of-practice in human–computer interaction and psychometrics directly contributing toward a paradigm shift in subjective reporting.

    Learning deep physiological models of affect

    Feature extraction and feature selection are crucial phases in the process of affective modeling. Both, however, incorporate substantial limitations that hinder the development of reliable and accurate models of affect. For the purpose of modeling affect manifested through physiology, this paper builds on recent advances in machine learning with deep learning (DL) approaches. The efficiency of DL algorithms that train artificial neural network models is tested and compared against standard feature extraction and selection approaches followed in the literature. Results on a game data corpus, containing players’ physiological signals (i.e. skin conductance and blood volume pulse) and subjective self-reports of affect, reveal that DL outperforms manual ad-hoc feature extraction as it yields significantly more accurate affective models. Moreover, it appears that DL meets and even outperforms affective models that are boosted by automatic feature selection, for several of the scenarios examined. As the DL method is generic and applicable to any affective modeling task, the key findings of the paper suggest that ad-hoc feature extraction and selection, to a lesser degree, could be bypassed. The authors would like to thank Tobias Mahlmann for his work on the development and administration of the cluster used to run the experiments. Special thanks for proofreading goes to Yana Knight. Thanks also go to the Theano development team, to all participants in our experiments, and to Ubisoft, NSERC and Canada Research Chairs for funding. This work is funded, in part, by the ILearnRW (project no: 318803) and the C2Learn (project no. 318480) FP7 ICT EU projects.

    Don't classify ratings of affect; rank them!

    How should affect be appropriately annotated and how should machine learning best be employed to map manifestations of affect to affect annotations? What is the use of ratings of affect for the study of affective computing and how should we treat them? These are the key questions this paper attempts to address by investigating the impact of dissimilar representations of annotated affect on the efficacy of affect modelling. In particular, we compare several different binary-class and pairwise preference representations for automatically learning from ratings of affect. The representations are compared and tested on three datasets: one synthetic dataset (testing “in vitro”) and two affective datasets (testing “in vivo”). The synthetic dataset couples a number of attributes with generated rating values. The two affective datasets contain physiological and contextual user attributes, and speech attributes, respectively; these attributes are coupled with ratings of various affective and cognitive states. The main results of the paper suggest that ratings (when used) should be naturally transformed to ordinal (ranked) representations for obtaining more reliable and generalisable models of affect. The findings of this paper have a direct impact on affect annotation and modelling research but, most importantly, challenge the traditional state-of-practice in affective computing and psychometrics at large.
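
As an illustration of the ratings-to-ranks idea in the abstract above, the following minimal sketch turns one participant's ratings into pairwise preferences. The abstract does not specify the exact transformation used in the paper, so the function name `ratings_to_pairwise` and the `margin` parameter are assumptions made for this example only.

```python
from itertools import combinations


def ratings_to_pairwise(ratings, margin=0):
    """Convert one participant's ratings into pairwise preferences.

    Returns a list of (preferred_index, other_index) tuples. Pairs whose
    ratings differ by no more than `margin` are treated as "no clear
    preference" and skipped. Both the function and `margin` are
    hypothetical, not taken from the paper.
    """
    pairs = []
    for i, j in combinations(range(len(ratings)), 2):
        diff = ratings[i] - ratings[j]
        if diff > margin:
            pairs.append((i, j))   # item i rated higher than item j
        elif diff < -margin:
            pairs.append((j, i))   # item j rated higher than item i
    return pairs


# Four sessions rated by the same participant on a 1-5 scale.
session_ratings = [3, 5, 3, 1]
print(ratings_to_pairwise(session_ratings))
# → [(1, 0), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Comparing only within a participant is a deliberate choice in this sketch: it avoids assuming that two people mean the same thing by the same rating value.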

    Analysing the relevance of experience partitions to the prediction of players’ self-reports of affect

    A common practice in modeling affect from physiological signals consists of reducing the signals to a set of statistical features that feed predictors of self-reported emotions. This paper analyses the impact of various time-windows, used for the extraction of physiological features, on the accuracy of affective models of players in a simple 3D game. Results show that the signals recorded in the central part of a short gaming experience contain more relevant information for the prediction of positive affective states than the starting and ending parts, while the information relevant to predicting anxiety and frustration appears not to be localized in a specific time interval but rather to depend on particular game stimuli.
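
The time-window idea above can be sketched as follows. The abstract does not say how the experience was partitioned, so the equal split into thirds, the function names, and the choice of the mean as the per-window feature are all assumptions for illustration.

```python
def partition_thirds(signal):
    """Split a signal into start, central and end windows of near-equal length.

    The three-way equal split is an assumption of this sketch, not the
    partitioning used in the paper.
    """
    n = len(signal)
    if n < 3:
        raise ValueError("need at least three samples to form three windows")
    a, b = n // 3, 2 * n // 3
    return {"start": signal[:a], "central": signal[a:b], "end": signal[b:]}


def window_means(signal):
    """Compute one simple statistical feature (the mean) per window."""
    return {name: sum(part) / len(part)
            for name, part in partition_thirds(signal).items()}


# Hypothetical skin conductance samples from one short game session.
sc_signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(window_means(sc_signal))
# → {'start': 1.5, 'central': 3.5, 'end': 5.5}
```

Each window's features could then feed a separate predictor, which makes it possible to compare how informative each part of the session is.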

    Deep multimodal fusion: combining discrete events and continuous signals

    Multimodal datasets often feature a combination of continuous signals and a series of discrete events. For instance, when studying human behaviour it is common to annotate actions performed by the participant over several other modalities such as video recordings of the face or physiological signals. These events are nominal, not frequent and are not sampled at a continuous rate while signals are numeric and often sampled at short fixed intervals. This fundamentally different nature complicates the analysis of the relation among these modalities which is often studied after each modality has been summarised or reduced. This paper investigates a novel approach to model the relation between such modality types bypassing the need for summarising each modality independently of each other. For that purpose, we introduce a deep learning model based on convolutional neural networks that is adapted to process multiple modalities at different time resolutions, which we name deep multimodal fusion. Furthermore, we introduce and compare three alternative methods (convolution, training and pooling fusion) to integrate sequences of events with continuous signals within this model. We evaluate deep multimodal fusion using a game user dataset where player physiological signals are recorded in parallel with game events. Results suggest that the proposed architecture can appropriately capture multimodal information as it yields higher prediction accuracies compared to single-modality models. In addition, it appears that pooling fusion, based on a novel filter-pooling method, provides the more effective fusion approach for the investigated types of data.

    Mining multimodal sequential patterns: a case study on affect detection

    Temporal data from multimodal interaction such as speech and bio-signals cannot be easily analysed without a preprocessing phase through which some key characteristics of the signals are extracted. Typically, standard statistical signal features such as average values are calculated prior to the analysis and, subsequently, are presented either to a multimodal fusion mechanism or a computational model of the interaction. This paper proposes a feature extraction methodology which is based on frequent sequence mining within and across multiple modalities of user input. The proposed method is applied for the fusion of physiological signals and gameplay information in a game survey dataset. The obtained sequences are analysed and used as predictors of user affect resulting in computational models of equal or higher accuracy compared to the models built on standard statistical features.

    Multimodal PTSD characterization via the StartleMart game

    Computer games have recently shown promise as a diagnostic and treatment tool for psychiatric rehabilitation. This paper examines the potential of combining multiple modalities for detecting affective responses of patients interacting with a simulation built on game technology, aimed at the treatment of mental diagnoses such as Post Traumatic Stress Disorder (PTSD). For that purpose, we couple game design and game technology to create a game-based tool for exposure therapy and stress inoculation training that utilizes stress detection for the automatic profiling and potential personalization of PTSD treatments. The PTSD treatment game we designed forces the player to go through various stressful experiences while a stress detection mechanism profiles the severity and type of PTSD by analyzing the physiological responses to those in-game stress elicitors in two separate modalities: skin conductance (SC) and blood volume pulse (BVP). SC is often used to monitor stress as it is connected to the activation of the sympathetic nervous system (SNS). By including BVP into the model we introduce information about para-sympathetic activation, which offers a more complete view of the psycho-physiological experience of the player; in addition, as BVP is also modulated by SNS, a multimodal model should be more robust to changes in each modality due to particular drugs or day-to-day bodily changes. Overall, the study and analysis of 14 PTSD-diagnosed veteran soldiers presented in this paper reveals correspondence between diagnostic standard measures of PTSD severity and SC and BVP responsiveness and feature combinations thereof. The study also reveals that these features are significantly correlated with subjective evaluations of the stressfulness of experiences, represented as pairwise preferences. More importantly, the results presented here demonstrate that using the modalities of skin conductance and blood volume pulse captures a more nuanced representation of player stress responses than using skin conductance alone. We conclude that the results support the use of the simulation as a relevant treatment tool for stress inoculation training, and suggest the feasibility of using such a tool to profile PTSD patients. The use of multiple modalities appears to be key for an accurate profiling, although further research and analysis are required to identify the most relevant physiological features for capturing user stress.

    Validating generic metrics of fairness in game-based resource allocation scenarios with crowdsourced annotations

    Being able to effectively measure the notion of fairness is of vital importance as it can provide insight into the formation and evolution of complex patterns and phenomena, such as social preferences, collaboration, group structures and social conflicts. This paper presents a comparative study for quantitatively modelling the notion of fairness in one-to-many resource allocation scenarios - i.e. one provider agent has to allocate resources to multiple receiver agents. For this purpose, we investigate the efficacy of six metrics and cross-validate them on crowdsourced human ranks of fairness annotated through a computer game implementation of the one-to-many resource allocation scenario. Four of the fairness metrics examined are well-established metrics of data dispersion, namely standard deviation, normalised entropy, the Gini coefficient and the fairness index. The fifth metric, proposed by the authors, is an ad-hoc context-based measure which is based on key aspects of distribution strategies. The sixth metric, finally, is machine learned via ranking support vector machines (SVMs) on the crowdsourced human perceptions of fairness. Results suggest that all ad-hoc designed metrics correlate well with the human notion of fairness, and the context-based metrics we propose appear to have a predictability advantage over the other ad-hoc metrics. On the other hand, the normalised entropy and fairness index metrics appear to be the most expressive and generic for measuring fairness for the scenario adopted in this study and beyond. The SVM model can automatically model fairness more accurately than any ad-hoc metric examined (with an accuracy of 81.86%) but it is limited by its expressivity and generalisability.
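
The four dispersion metrics named in the abstract above can be sketched for a single allocation as follows. The abstract gives no formulas, so these are the standard textbook definitions; in particular, reading "the fairness index" as Jain's fairness index is an assumption, and all function names are hypothetical.

```python
import math


def _check(alloc):
    # The sketch only handles non-empty allocations with a positive total.
    if len(alloc) < 2 or min(alloc) < 0 or sum(alloc) <= 0:
        raise ValueError("need at least two non-negative amounts with a positive total")


def std_dev(alloc):
    """Population standard deviation of the allocated amounts (0 = equal shares)."""
    _check(alloc)
    mean = sum(alloc) / len(alloc)
    return math.sqrt(sum((x - mean) ** 2 for x in alloc) / len(alloc))


def normalised_entropy(alloc):
    """Shannon entropy of the shares divided by log(n); 1 = equal shares."""
    _check(alloc)
    total = sum(alloc)
    shares = [x / total for x in alloc if x > 0]
    return -sum(p * math.log(p) for p in shares) / math.log(len(alloc))


def gini(alloc):
    """Gini coefficient via the mean absolute difference; 0 = equal shares."""
    _check(alloc)
    diff_sum = sum(abs(x - y) for x in alloc for y in alloc)
    return diff_sum / (2 * len(alloc) * sum(alloc))


def jain_index(alloc):
    """Jain's fairness index (assumed reading of 'the fairness index'); 1 = equal shares."""
    _check(alloc)
    return sum(alloc) ** 2 / (len(alloc) * sum(x * x for x in alloc))


# One provider splitting 10 units among four receivers, two ways.
for allocation in ([10, 0, 0, 0], [5, 5, 5, 5]):
    print(allocation, std_dev(allocation), normalised_entropy(allocation),
          gini(allocation), jain_index(allocation))
```

Note that the metrics point in different directions: standard deviation and Gini grow with unfairness, while normalised entropy and Jain's index shrink, so they need to be aligned before being compared against human ranks.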

    Genetic search feature selection for affective modeling: a case study on reported preferences

    Automatic feature selection is a critical step towards the generation of successful computational models of affect. This paper presents a genetic search-based feature selection method which is developed as a global-search algorithm for improving the accuracy of the affective models built. The method is tested and compared against sequential forward feature selection and random search in a dataset derived from a game survey experiment which contains bimodal input features (physiological and gameplay) and expressed pairwise preferences of affect. Results suggest that the proposed method is capable of picking subsets of features that generate more accurate affective models.

    Generic physiological features as predictors of player experience

    This paper examines the generality of features extracted from heart rate (HR) and skin conductance (SC) signals as predictors of self-reported player affect expressed as pairwise preferences. Artificial neural networks are trained to accurately map physiological features to expressed affect in two dissimilar and independent game surveys. The performance of the obtained affective models which are trained on one game is tested on the unseen physiological and self-reported data of the other game. Results in this early study suggest that there exist features of HR and SC, such as average HR and one- and two-step SC variation, that are able to predict affective states across games of different genres and dissimilar game mechanics.
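
The features mentioned in the abstract above can be sketched as follows. The abstract does not define "one and two-step SC variation"; this example assumes it means the average absolute difference between samples one or two steps apart, and the function names are hypothetical.

```python
def average(values):
    """Arithmetic mean of a non-empty sequence of samples."""
    if not values:
        raise ValueError("signal must contain at least one sample")
    return sum(values) / len(values)


def step_variation(signal, step):
    """Average absolute difference between samples `step` positions apart.

    This definition is an assumption made for the sketch; the paper may
    define SC variation differently.
    """
    if step < 1 or len(signal) <= step:
        raise ValueError("signal must be longer than the step size")
    diffs = [abs(signal[i + step] - signal[i]) for i in range(len(signal) - step)]
    return average(diffs)


# Hypothetical samples from one game session.
hr = [60.0, 62.0, 64.0]        # heart rate, beats per minute
sc = [1.0, 2.0, 4.0, 3.0]      # skin conductance, arbitrary units

features = {
    "average_hr": average(hr),
    "sc_one_step": step_variation(sc, 1),
    "sc_two_step": step_variation(sc, 2),
}
print(features)
```

Features of this kind depend only on the signal itself, not on any game-specific event, which is one plausible reason they could transfer across games.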